Topic Identification for Speech Without ASR

نویسندگان

  • Chunxi Liu
  • Jan Trmal
  • Matthew Wiesner
  • Craig Harman
  • Sanjeev Khudanpur
چکیده

Modern topic identification (topic ID) systems for speech use automatic speech recognition (ASR) to produce speech transcripts, and perform supervised classification on such ASR outputs. However, under resource-limited conditions, the manually transcribed speech required to develop standard ASR systems can be severely limited or unavailable. In this paper, we investigate alternative unsupervised solutions to obtaining tokenizations of speech in terms of a vocabulary of automatically discovered word-like or phoneme-like units, without depending on the supervised training of ASR systems. Moreover, using automatic phoneme-like tokenizations, we demonstrate that a convolutional neural network based framework for learning spoken document representations provides competitive performance compared to a standard bag-of-words representation, as evidenced by comprehensive topic ID evaluations on both single-label and multi-label classification tasks.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A comparison of normalization techniques applied to latent space representations for speech analytics

In the context of noisy environments, Automatic Speech Recognition (ASR) systems usually produce poor transcription quality which also negatively impact performance of speech analytics. Various methods have then been proposed to compensate the bad effect of ASR errors, mainly by projecting transcribed words in an abstract space. In this paper, we seek to identify themes from dialogues of teleph...

متن کامل

Techniques for rapid and robust topic identification of conversational telephone speech

In this paper, we investigate the impact of automatic speech recognition (ASR) errors on the accuracy of topic identification in conversational telephone speech. We present a modified TF-IDF feature weighting calculation that provides significant robustness under various recognition error conditions. For our experiments we take conversations from the Fisher corpus to produce 1-best and lattice ...

متن کامل

A Novel Algorithm of Sparse Representations for Speech Compression/Enhancement and Its Application in Speaker Recognition System

This paper proposes sparse and redundancy representation spectral domain compression of the speech signal using novel sparsing algorithms to the problem of speech compression (SC)/enhancement (SE). In Automatic Speaker Recognition (ASR) sparsification can play a major role to resolve big data issues in speech compression and its storage in the database, where the speech signal can be uncompress...

متن کامل

Web-based language modelling for automatic lecture transcription

Universities have long relied on written text to share knowledge. As more lectures are made available on-line, these must be accompanied by textual transcripts in order to provide the same access to information as textbooks. While Automatic Speech Recognition (ASR) is a cost-effective method to deliver transcriptions, its accuracy for lectures is not yet satisfactory. One approach for improving...

متن کامل

Automatic Topic Identification for Large Scale Language Modeling Data Filtering

The paper presents a module for topic identification that is embedded into a complex system for acquisition and storing large volumes of text data from the Web. The module processes each of the acquired data items and assigns keywords to them from a defined topic hierarchy that was developed for this purposes and is also described in the paper. The quality of the topic identification is evaluat...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2017